Stellar photospheric activity is known to limit the detection and characterisation of extra-solar planets. In particular, the study of Earth-like planets around Sun-like stars requires data analysis methods that can accurately model the stellar activity phenomena affecting radial velocity (RV) measurements. Gaussian Process Regression Networks (GPRNs) offer a principled approach to the analysis of simultaneous time-series, combining the structural properties of Bayesian neural networks with the non-parametric flexibility of Gaussian Processes. Using HARPS-N solar spectroscopic observations encompassing three years, we demonstrate that this framework is capable of jointly modelling RV data and traditional stellar activity indicators. Although we consider only the simplest GPRN configuration, we are able to describe the behaviour of solar RV data at least as accurately as previously published methods. We confirm the correlation between the RV and stellar activity time series reaches a maximum at separations of a few days, and find evidence of non-stationary behaviour in the time series, associated with an approaching solar activity minimum.
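As a rough illustration of what "the correlation reaches a maximum at separations of a few days" means, the sketch below computes a lagged Pearson correlation between two evenly sampled series. This is not the GPRN analysis described in the abstract: the function names, the synthetic data, and the even sampling are all assumptions made for the example, and real HARPS-N time series are unevenly sampled.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def lagged_correlation(rv, indicator, max_lag):
    """Correlate rv[t + lag] with indicator[t] for lag = 0..max_lag (in samples)."""
    result = {}
    for lag in range(max_lag + 1):
        result[lag] = pearson(rv[lag:], indicator[:len(indicator) - lag])
    return result


# Synthetic toy data (assumption): the "RV" series is the indicator delayed by 2 samples.
indicator = [math.sin(0.3 * t) for t in range(200)]
rv = [0.0, 0.0] + indicator[:-2]

corr = lagged_correlation(rv, indicator, max_lag=5)
best_lag = max(corr, key=corr.get)
print(best_lag)
```

With this toy input the correlation peaks at the lag that was built into the data, which is the kind of signature the abstract refers to.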
Three main points: 1. Data Science (DS) will be increasingly important to heliophysics; 2. Methods of heliophysics science discovery will continually evolve, requiring the use of learning technologies [e.g., machine learning (ML)] that are applied rigorously and that are capable of supporting discovery; and 3. To grow with the pace of data, technology, and workforce changes, heliophysics requires a new approach to the representation of knowledge.
SchNetPack is a versatile neural network toolbox that addresses the requirements of both method development and application of atomistic machine learning. Version 2.0 comes with an improved data pipeline, modules for equivariant neural networks, and a PyTorch implementation of molecular dynamics. An optional integration with PyTorch Lightning and the Hydra configuration framework powers a flexible command-line interface. This makes SchNetPack 2.0 easily extendable with custom code and ready for complex training tasks such as the generation of 3D molecular structures.
We developed a simulator to quantify the effect of changes in environmental parameters on plant growth in precision farming. Our approach combines the processing of plant images with deep convolutional neural networks (CNN), growth curve modeling, and machine learning. As a result, our system is able to predict growth rates based on environmental variables, which opens the door for the development of versatile reinforcement learning agents.
Synthetic data generation has recently gained widespread attention as a more reliable alternative to traditional data anonymization. The methods involved were originally developed for image synthesis. Hence, their application to the typically tabular and relational datasets from healthcare, finance and other industries is non-trivial. While substantial research has been devoted to the generation of realistic tabular datasets, the study of synthetic relational databases is still in its infancy. In this paper, we combine the variational autoencoder framework with graph neural networks to generate realistic synthetic relational databases. We then apply the resulting method to two publicly available databases in computational experiments. The results indicate that the structures of real databases are accurately preserved in the resulting synthetic datasets, even for large datasets with advanced data types.
Artificial Intelligence (AI) and its data-centric branch of machine learning (ML) have greatly evolved over the last few decades. However, as AI is increasingly used in real-world applications, the interpretability and accessibility of AI systems have become major research areas. The lack of interpretability of ML-based systems is a major hindrance to widespread adoption of these powerful algorithms. This is due to many reasons, including ethical and regulatory concerns, which have resulted in poorer adoption of ML in some areas. The recent past has seen a surge in research on interpretable ML. Generally, designing an ML system requires good domain understanding combined with expert knowledge. New techniques are emerging to improve ML accessibility through automated model design. This paper provides a review of the work done to improve interpretability and accessibility of machine learning in the context of global problems while also being relevant to developing countries. We review work under multiple levels of interpretability, including scientific and mathematical interpretation, statistical interpretation and partial semantic interpretation. This review includes applications in three areas, namely food processing, agriculture and health.
To analyze this characteristic of vulnerability, we developed an automated deep learning method for detecting microvessels in intravascular optical coherence tomography (IVOCT) images. A total of 8,403 IVOCT image frames from 85 lesions and 37 normal segments were analyzed. Manual annotation was done using dedicated software (OCTOPUS) previously developed by our group. Data augmentation in the polar (r, θ) domain was applied to raw IVOCT images to ensure that microvessels appear at all possible angles. Pre-processing methods included guidewire/shadow detection, lumen segmentation, pixel shifting, and noise reduction. DeepLab v3+ was used to segment microvessel candidates. A bounding box on each candidate was classified as either microvessel or non-microvessel using a shallow convolutional neural network. For better classification, we used data augmentation (i.e., angle rotation) on bounding boxes with a microvessel during network training. Data augmentation and pre-processing steps improved microvessel segmentation performance significantly, yielding a method with a Dice coefficient of 0.71 ± 0.10 and pixel-wise sensitivity/specificity of 87.7 ± 6.6% / 99.8 ± 0.1%. The network for classifying microvessels from candidates performed exceptionally well, with a sensitivity of 99.5 ± 0.3%, a specificity of 98.8 ± 1.0%, and an accuracy of 99.1 ± 0.5%. The classification step eliminated the majority of residual false positives, and the Dice coefficient increased from 0.71 to 0.73. In addition, our method produced 698 image frames with microvessels present, compared to 730 from manual analysis, representing a 4.4% difference. When compared to the manual method, the automated method improved microvessel continuity, implying improved segmentation performance. The method will be useful for research purposes as well as potential future treatment planning.
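The segmentation metrics quoted above have standard definitions, which the following minimal sketch spells out on a toy example. The flattened ten-pixel masks are invented for illustration and are not the study's data; the last lines check the frame-count arithmetic under the assumption that the 4.4% difference is measured relative to the manual count.

```python
def confusion_counts(pred, truth):
    """Count true/false positives/negatives for two flat binary masks."""
    tp = fp = tn = fn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def dice(tp, fp, fn):
    """Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


def sensitivity(tp, fn):
    """Fraction of true foreground pixels that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of true background pixels that were left unlabelled."""
    return tn / (tn + fp)


# Toy masks (hypothetical): 1 = microvessel pixel, 0 = background.
pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
truth = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
tp, fp, tn, fn = confusion_counts(pred, truth)
print(dice(tp, fp, fn), sensitivity(tp, fn), specificity(tn, fp))

# Frame counts from the abstract: 698 automated vs 730 manual.
relative_difference = (730 - 698) / 730
print(round(100 * relative_difference, 1))
```

Because background pixels typically dominate such images, specificity tends to sit near 100% while Dice remains the more demanding measure, which is consistent with the pattern of numbers reported above.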
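Machine learning potentials are an important tool for molecular simulation, but their development is held back by a shortage of high-quality datasets to train them on. We describe the SPICE dataset, a new quantum chemistry dataset for training potentials relevant to simulating drug-like small molecules interacting with proteins. It contains over 1.1 million conformations for a diverse set of small molecules, dimers, dipeptides, and solvated amino acids. It includes 15 elements, charged and uncharged molecules, and a wide range of covalent and non-covalent interactions. It provides both forces and energies calculated at the ωB97M-D3(BJ)/def2-TZVPPD level of theory, along with other useful quantities such as multipole moments and bond orders. We train a set of machine learning potentials on it and demonstrate that they can achieve chemical accuracy across a broad region of chemical space. It can serve as a valuable resource for the creation of transferable, ready-to-use potential functions for molecular simulations.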
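Developing an effective automatic classifier to separate genuine sources from artifacts is essential for transient follow-up in wide-field optical surveys. Identifying transient detections among the subtraction artifacts left by the image differencing process is a key step in such classifiers, known as the real-bogus classification problem. We apply a self-supervised machine learning model, the deep-embedded self-organizing map (DESOM), to this "real-bogus" classification problem. DESOM combines an autoencoder and a self-organizing map to perform clustering, in order to distinguish between real and bogus detections based on their dimensionality-reduced representations. We use 32x32 normalized detection thumbnails as the input to DESOM. We demonstrate different model training approaches and find that our best DESOM classifier shows a missed detection rate of 6.6% with a false positive rate of 1.5%. DESOM offers a more nuanced way to fine-tune the decision boundary identifying likely real detections when used in combination with other types of classifiers, for example those built on neural networks or decision trees. We also discuss other potential uses of DESOM and its limitations.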
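To make the trade-off behind "fine-tuning the decision boundary" concrete, the sketch below computes a missed detection rate and a false positive rate at several thresholds. It assumes a hypothetical per-detection score in [0, 1] where higher means "more likely real"; the abstract does not say that DESOM produces such a score, and the scores and labels here are invented.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (missed detection rate, false positive rate).

    labels: 1 = real, 0 = bogus. A detection is flagged as real
    when its score is greater than or equal to the threshold.
    """
    missed = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    false_pos = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    n_real = sum(labels)
    n_bogus = len(labels) - n_real
    return missed / n_real, false_pos / n_bogus


# Hypothetical scores and ground-truth labels for eight detections.
scores = [0.95, 0.80, 0.60, 0.40, 0.70, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

results = {thr: rates_at_threshold(scores, labels, thr) for thr in (0.35, 0.5, 0.75)}
for thr, (mdr, fpr) in results.items():
    print(f"threshold={thr}: missed={mdr:.2f}, false_positive={fpr:.2f}")
```

Raising the threshold lowers the false positive rate at the cost of missing more real detections, so a single pair of rates such as the one quoted above corresponds to one chosen operating point.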
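Existing data-driven and feedback traffic control strategies do not consider the heterogeneity of real-time data measurements. In addition, traditional reinforcement learning (RL) methods for traffic control usually converge slowly because they lack data efficiency. Moreover, conventional optimal perimeter control schemes require exact knowledge of the system dynamics and are therefore fragile to endogenous uncertainties. To handle these challenges, this work proposes an integral reinforcement learning (IRL) based approach to learning the macroscopic traffic dynamics for adaptive optimal perimeter control. This work makes the following primary contributions to the transportation literature: (a) A continuous-time control is developed with discrete gain updates to adapt to discrete-time sensor data. (b) To reduce the sampling complexity and use the available data more efficiently, the experience replay (ER) technique is introduced into the IRL algorithm. (c) The proposed method relaxes the requirement for model calibration in a "model-free" manner, which enables robustness against modeling uncertainty and enhances real-time performance via a data-driven RL algorithm. (d) The convergence of the IRL-based algorithms and the stability of the controlled traffic dynamics are proven via Lyapunov theory. The optimal control law is parameterized and then approximated by neural networks (NN), which moderates the computational complexity. Both state and input constraints are considered, while no model linearization is required. Numerical examples and simulation experiments are presented to verify the effectiveness and efficiency of the proposed method.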
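Contribution (b) relies on experience replay, whose core idea is to store past transitions and reuse random samples of them during learning. The following is a generic, minimal sketch of such a buffer, not the IRL algorithm from the abstract; the class name `ReplayBuffer`, the tuple layout (state, control, cost, next state), and the placeholder values are assumptions made for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of past transitions with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest entry when full.
        self._data = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, control, cost, next_state):
        """Record one observed transition."""
        self._data.append((state, control, cost, next_state))

    def sample(self, batch_size):
        """Draw up to batch_size distinct stored transitions uniformly at random."""
        k = min(batch_size, len(self._data))
        return self._rng.sample(list(self._data), k)

    def __len__(self):
        return len(self._data)


# Usage with placeholder scalar states (hypothetical values).
buffer = ReplayBuffer(capacity=3, seed=0)
for step in range(5):
    buffer.add(state=step, control=0.1 * step, cost=1.0, next_state=step + 1)

batch = buffer.sample(2)
print(len(buffer), batch)
```

Reusing stored transitions in this way lets each sensor measurement contribute to several parameter updates, which is the data-efficiency benefit the abstract attributes to ER.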